Abstract

This paper reviews nine available transcription and annotation tools,
considering in particular the special difficulties arising from transcribing
and annotating multi-party, multi-modal dialogue. Tools are evaluated as to
the ability to support the user’s annotation scheme, ability to visualize the
form of the data, compatibility with other tools, flexibility of data
representation, and general user-friendliness.


1.  Introduction 

As the range and variety of language resources develop, there is an
increasing need for broader and more flexible tools to support the unique
needs of multiple domains. The Mission Rehearsal Exercise Corpus (MREC)
(Robinson et al., 2004) presents particular challenges to existing
transcription and annotation tools, as it consists primarily of multi-party
military training simulation dialogues, including human-human radio dialogue
and dialogue between a human and multiple virtual agents in the MRE scenario.
The MREC is multi-modal in two senses: in the sense of audio and visual data
incorporating gesture, and in the sense of different modalities within a
scenario (radio and face-to-face conversation) (Traum, 2001). As other
reviews have focused on the former issues of multi-modality (cf. Bernsen et
al., 2002), we focus this review on the special issues and problems of the
latter.

The following transcription and annotation tools were evaluated: Praat,
Transcriber, TASX, Anvil, MMAX, DialogueTool, the ILSP tool, the NITE
Workbench, and DAT. We evaluated each of these tools as to suitability for
several tasks, including transcription from audio or video and annotation of
speakers and addressees, several types of dialogue acts, and dependent
reference. We evaluated each tool along several dimensions, including
Input/Output Flexibility, Portability, Source Code Availability, Flexibility
in Coding Scheme, Range of Markables, Audio/Visual Playback, Visual
Interface, and User Support. In addition to the tools’ performance under our
basic criteria, the nature of the corpus presents some special problems which
were not easily solved in any of the tools reviewed, including many speakers
within an interaction and large amounts of overlapping speech and dialogues.

Part of MREC includes radio simulation data consisting of multiple channels
of speech with a large number of participants (over 35) engaged in a common
overall mission. While some participants have frequent contributions, a
number contribute only occasionally. This number and variation in speakers
(as well as the occasional challenge of identifying speakers) presents
problems for tools that allocate individual tiers for each speaker, as the
number of tiers either grows unwieldy or is limited by the tool. While the
ability to deal with overlaps in some form seems basic to any spoken
dialogue, the more frequent potential for overlaps in multi-party dialogue
increases the necessity that it be done gracefully, and that it be able to
include any associated annotation. Furthermore, multi-party dialogue presents
the additional problem of overlapping dialogues (where individual overlaps
between speakers do not conflict, because different participants are involved
in the different dialogues).
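
The distinction between overlapping speech and overlapping dialogues can be
made concrete with a minimal Python sketch. The Segment record, its dialogue
field, and the sample timings are hypothetical illustrations and do not
correspond to the data format of any tool reviewed here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Segment:
    speaker: str
    dialogue: str   # hypothetical label for the conversation the segment belongs to
    start: float    # seconds
    end: float      # seconds


def overlaps(a: Segment, b: Segment) -> bool:
    """True when the two time intervals share more than a single point."""
    return a.start < b.end and b.start < a.end


def classify_overlaps(segments):
    """Separate overlaps within one dialogue from overlaps across dialogues."""
    same_dialogue, cross_dialogue = [], []
    for a, b in combinations(segments, 2):
        if a.speaker == b.speaker or not overlaps(a, b):
            continue
        target = same_dialogue if a.dialogue == b.dialogue else cross_dialogue
        target.append((a, b))
    return same_dialogue, cross_dialogue


# Invented timings, for illustration only.
segments = [
    Segment("LT", "face-to-face", 0.0, 2.5),
    Segment("Sgt", "face-to-face", 2.0, 4.0),  # overlaps LT within the same dialogue
    Segment("Radio1", "radio", 1.0, 3.0),      # overlaps both, but in another dialogue
]
same, cross = classify_overlaps(segments)
print(len(same), len(cross))
```

Only the pairs in the first list are overlaps that an annotator would need to
treat as competing speech; the pairs in the second list belong to separate
conversations that happen to coincide in time.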

2.  Annotation  Requirements 

In our study we want to transcribe and annotate audio sessions of simulations
in the MRE (human-computer interaction) and MRE-lite (human-human
interaction) environments. There are two goals of annotating the sessions:
first, to construct a large corpus with which to test different theories, and
second, to use the corpus and theory testing to improve the performance of
the MRE system.

Several factors influence the choice of annotation tool. First, the tool must
be able to support the user’s annotation scheme. Second, the tool must be
user-friendly and possibly compatible with other tools. For our purposes we
require a set of tools that can aid an annotator with transcribing data from
audio files and possibly even video files. After a file is transcribed it
needs to be annotated. In our study we want to annotate dialogue acts and
reference between entities. Annotating dialogue acts involves recognizing and
marking (or “tagging”) utterances with different codes which represent the
actions performed in the conversation. Often there is more than one code per
utterance. Reference is the study of the relationships between entities in a
discourse. Thus annotating reference involves recognizing and marking these
entities (usually noun phrases) and then marking the relationship between an
entity and a past entity. Reference resolution is important to a dialogue
system because failure to resolve entities correctly can lead to confusion
between the speakers.

Consider the dialogue fragment below from our MRE corpus. The Lieutenant and
Sergeant are standing in front of a crash site with a damaged car, a Humvee,
a boy lying on the ground, and a woman hunched over him:

14  Sgt  This woman and her son came from the
         side street and our driver didn’t see them

15  LT   can we medevac him out of here?

There are a number of dialogue acts performed: utterance 14 is an assertion
and 15 is an information request. But 14 also ends the turn, while 15 is an
acknowledgment of 14. At the reference level, the entities for both
utterances are listed in italics. In 14, Her son is a bridging reference back
to this woman. Our driver refers to the Humvee in the scene and them refers
back to the compound woman and her son. In 15, we refers inclusively to the
LT and Sgt (and perhaps some of the units to which they belong). Him is
ambiguous, though most likely to refer to the injured boy in view. Here
refers to the local setting.
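
To illustrate what the two annotation levels have to record for this
fragment, the following Python sketch stores the dialogue acts per utterance
and the reference links per entity. The class names, identifiers, and label
strings (such as "release-turn" or "anaphoric") are assumptions made for the
example and are not the codes of our annotation scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Utterance:
    number: int
    speaker: str
    text: str
    dialogue_acts: list = field(default_factory=list)  # several codes per utterance are allowed


@dataclass
class Markable:
    ident: str
    utterance: int
    text: str
    relation: str = ""        # illustrative label such as "bridging" or "anaphoric"
    antecedents: tuple = ()   # identifiers of earlier markables, if any


utterances = [
    Utterance(14, "Sgt",
              "This woman and her son came from the side street "
              "and our driver didn't see them",
              ["assertion", "release-turn"]),
    Utterance(15, "LT", "can we medevac him out of here?",
              ["info-request", "acknowledge"]),
]

markables = {
    m.ident: m
    for m in [
        Markable("m1", 14, "This woman"),
        Markable("m2", 14, "her son", "bridging", ("m1",)),
        Markable("m3", 14, "our driver"),
        Markable("m4", 14, "them", "anaphoric", ("m1", "m2")),
        Markable("m5", 15, "we"),
        Markable("m6", 15, "him", "anaphoric", ("m2",)),  # most likely reading only
        Markable("m7", 15, "here"),
    ]
}


def antecedent_texts(ident):
    """Return the surface text of each antecedent of the given markable."""
    return [markables[a].text for a in markables[ident].antecedents]


print(antecedent_texts("m4"))
```

The sketch shows why a tool needs both utterance-level codes (possibly
several per utterance) and sub-sentential markables that can point to more
than one earlier entity.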

Given  the  requirements  for  each  type  of  annotation  and 
the  need  to  make  things  as  easy  for  annotators  as  possible, 
we  outlined  several  issues  to  rate  each  annotation  tool. 

Input/Output flexibility: What is the data format for the tool’s input? For
audio files this may require converting our files to a different format just
to use the tool. Also, is the output from the tool compatible with other
tools and/or easily readable so one can check the annotators’ work?

Portability: Can the tool be used on different operating systems (such as
Windows or Linux)? Also, does the tool require special packages?

Source Code: Does the tool come with source code so it can be altered? This
is useful for possibly extending or modifying the coding scheme offered by
the tool, or altering the display to make it more user-friendly.

Audio/Visual interface: Does the tool offer an easy-to-use method for playing
sections of audio (or video) and segmenting sections? Can large audio files
be handled by the tool? Also, can it play back a sentence’s corresponding
audio file to check for intonation?

Comments: Does the tool allow the user to make comments or notes on their
annotation?

Flexibility in Coding Scheme: Does the tool require using its own scheme or
can one specify a different one?

Marking (Annotation Only): How much can the tool mark? Just words, or also
groups of words or acts? Can it also mark segments of sentences or just an
entire sentence? Is it possible to mark discontinuous acts (with material in
the middle that is not part of the act)?

Viewing work: Does the tool have a large enough display to show the current
work and the corresponding codes in a clear manner? Is earlier work visible
as well?

User manual: Does the tool come with a user manual? Having a manual saves on
training time and allows annotators a quick reference if needed.


Optimally, one would want a tool that could do both transcription and
annotation, but we did not find one that was flexible enough to use. In fact,
simply finding a tool that can handle the two different annotation schemes we
require is quite difficult. Thus we break up our review of tools into two
sections: transcription tools and annotation tools. For each tool, we give a
brief description and a list of its main advantages and disadvantages.

3.  Transcription  Tools 

These are tools used for transcribing a recorded session of audio and/or
video data. The result is usually either a simple text or XML file with a
time-ordered list of the dialogue between the session participants. Typical
information annotated for each sentence (or sub-sentential unit when
possible) is the speaker, the start and end time of the sentence, and any
comments about the marking. In this section we review four transcription
tools: Praat, Transcriber, TASX, and Anvil.

3.1.  Praat  (v4.0.43) 

Praat1 is a phonetics tool used for speech analysis and synthesis. It was not
originally intended for text transcription but rather for editing and
analyzing sound files. Transcribing with it involves loading an audio file,
clicking on the waveform to mark the start and end of a segment, and then
typing in the words for that segment. After transcription is done, the
transcribed text along with the time stamp information is saved in a text
file.

Praat is one of the best developed and most flexible transcription tools in
that it can run on both Windows and Linux platforms, its output is a simple
text format, it is compatible with the video annotation tool Anvil, and most
of all, it offers an easy segmentation interface. On the other hand, since it
is primarily a phonetics tool as opposed to a discourse markup tool,
previously transcribed segments are hard to see, and long segments are also
difficult to see in their entirety. It also cannot handle overlapping speech,
when two or more speakers speak at the same time. It is possible to
transcribe each speaker onto a separate track, but the tool does not merge
the tracks. Adding a field to write comments would be helpful.
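
When each speaker is transcribed on a separate track, the tracks can be
combined afterwards into a single time-ordered transcript. The Python sketch
below shows one way to do this, assuming the segments have already been read
into lists of (start, end, text) tuples; reading Praat's own text file is not
shown, and the function name and sample data are hypothetical.

```python
import heapq


def merge_tracks(tracks):
    """Interleave per-speaker tracks into one list ordered by start time.

    `tracks` maps a speaker label to a list of (start, end, text) tuples,
    each list already sorted by start time.
    """
    labelled = (
        [(start, end, speaker, text) for start, end, text in segments]
        for speaker, segments in tracks.items()
    )
    return list(heapq.merge(*labelled))


# Invented timings and placeholder text, for illustration only.
tracks = {
    "A": [(0.0, 1.5, "first segment"), (4.0, 5.0, "third segment")],
    "B": [(1.2, 3.0, "second segment, overlapping the first")],
}
for start, end, speaker, text in merge_tracks(tracks):
    print(f"{start:5.1f} {end:5.1f} {speaker}: {text}")
```

Because the start and end times are kept for every segment, overlapping
speech is still recoverable from the merged list.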

Praat has undergone several changes since the version we tested to make it
more user-friendly, but most of the problems we cited above have not been
addressed. However, Praat offers good support as well as source code, and the
authors say that they can tailor the tool to your needs if necessary.

3.2.  Transcriber  (v1.4)

Transcriber2 is a transcription tool developed in Tcl/Tk with C extensions.
It runs on various Unix systems as well as Windows operating systems. It was
originally intended for use in transcribing broadcast news recordings, so it
makes a good match with our domain.

Using the tool involves first segmenting the audio file. This is done by
playing the file and hitting a key upon hearing a segment break. This is
easier than in Praat since it involves a lot less clicking and segmentation
can be performed while listening continuously. Each segment is then
transcribed by clicking on its waveform and entering the text in the top
segment of the display panel.

1Praat: http://www.fon.hum.uva.nl/praat/ for more details

2Transcriber: http://www.etca.fr/CTA/gip/Projets/Transcriber/

Being specifically tailored for dialogue transcription, Transcriber avoids
several of the drawbacks of using Praat for such a task. It is able to
accommodate different speakers and has two windows - a large one to see the
entire dialogue and a smaller one to do the transcribing. Transcriber also
allows multiple tracks for each speaker, and offers a way to annotate speaker
information as well as a field for comments. The tool does not have as rich a
manual as Praat does, but it is straightforward enough to use without one.

3.3.  TASX  (v.  alpha  1) 

Another annotation tool, TASX3, is a Java program that handles both audio and
video, and works on Linux and Windows. It was originally intended to study
prosody acquisition. Like the previous tools, the primary display window
consists of different tiers or tracks. One can treat each track as a speaker
and segment the track by clicking the start and end points. Input and output
are in XML form.

We tested an early version of TASX that did not come with good documentation
or many of the advantages of the current version. The main disadvantages of
the version we tested involved ease of use. We found that it was hard to view
what you were working on and to view past work. In addition, our annotators
found that the segmentation operation was unwieldy. However, the latest
version seems to address many of these disadvantages. Other advantages
include the ability to link with Praat and source code availability.

3.4.  Anvil  (v3.6) 

Anvil4 (Kipp, 2001) is a Java tool for Windows, Unix, and Mac that was
developed for video analysis in gesture research. Its primary advantages are
a rich tier system in which transcribers can specify relationships between
tiers and its ease in marking different types of annotations. For our
purposes these abilities were more than we required, which was simple text
transcription. On the other hand, with Anvil it is difficult to view past
work, and the unused functionality has the potential to confuse annotators.
However, if one uses Anvil for transcription, it could double as a minor
annotation tool in that different attributes can be specified in each tier.
We used these to annotate dialogue acts. Anvil could be improved for this
task with a larger display window to view words from a segment.

3TASX: http://tasxforce.lili.uni-bielefeld.de/

4.  Annotation  Tools

After transcription, the next step is annotating the texts with our two
different coding schemes: dialogue acts and references. Our perfect tool
would be one which could work on input from our transcription tool, allow
user-specified coding schemes so one could use the same tool for both coding
schemes, and finally present the different annotations in a readable manner.
Another attribute of a good tool is the use of “standoff marking”, which
means that there are different files for each code in a scheme, which makes
it easy to check individual codes in a file. In this section we describe
MMAX, DialogueTool, the ILSP tool, NITE, and DAT.
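
The idea of standoff marking can be sketched in Python with the standard
xml.etree.ElementTree module. The base text is kept in one structure and each
annotation layer in its own structure that points back to word identifiers.
The element names, attribute names, and span notation below are invented for
the example and are not the file format of MMAX or of any other tool
described here.

```python
import xml.etree.ElementTree as ET

words = ["can", "we", "medevac", "him", "out", "of", "here"]

# Base layer: one element per word, each with a stable identifier.
base = ET.Element("words")
for i, token in enumerate(words, start=1):
    ET.SubElement(base, "word", id=f"w{i}").text = token

# Reference layer: markables point at word identifiers instead of embedding text.
reference = ET.Element("markables", layer="reference")
ET.SubElement(reference, "markable", id="m1", span="w2")
ET.SubElement(reference, "markable", id="m2", span="w4")

# Dialogue-act layer: a separate structure over the same base words.
acts = ET.Element("markables", layer="dialogue-act")
ET.SubElement(acts, "markable", id="a1", span="w1..w7", act="info-request")


def covered_text(markable, base_layer):
    """Resolve a span such as 'w2' or 'w1..w7' back to the words it covers."""
    first, _, last = markable.get("span").partition("..")
    word_elements = list(base_layer)
    ids = [w.get("id") for w in word_elements]
    lo, hi = ids.index(first), ids.index(last or first)
    return " ".join(w.text for w in word_elements[lo:hi + 1])


for layer in (reference, acts):
    for markable in layer:
        print(layer.get("layer"), markable.get("id"), covered_text(markable, base))
```

Each layer could be written to its own file, so a single code can be checked
or replaced without touching the transcription or the other layers.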

4.1.  MMAX  (v0.9) 

MMAX5 (Muller and Strube, 2001) runs in both Java 1.3 and 1.4 and has been
tested successfully in Windows but has had problems in Linux. It was
originally conceived for reference annotation but it can be altered to handle
utterance level annotations as well. The underlying concept is that you can
highlight anything in the text - a word, series of words, sentences, parts or
groups of sentences, or some combination of the above (called a markable) and
assign properties to that markable (entity). The annotation scheme is user
specified.

The  greatest  advantage  of  this  tool  is  that  it  can  support 
both  of  our  annotation  schemes  so  we  only  have  to  use  one 
tool.  This  is  easier  for  annotators  since  they  only  have  to 
learn  and  use  one  tool.  Other  advantages  of  MMAX  are  that 
it  has  good  support  from  its  creators  and  a  decent  manual. 

4.2.  DialogueTool 

DialogueTool (Hardy et al., 2003), from the University of Albany, runs on
Windows machines and is intended for annotating at the utterance level; it
cannot be used for annotating words or sub-sentential entities. The window of
the tool is split into two sections - the dialog display, which highlights
the current sentence being annotated, and a panel of drop-down coding menus.
As in MMAX, it is possible to annotate the same sentence with multiple codes;
however, in DialogueTool the only things you can mark up are sentences, not
words. Clauses can be marked, but only after the sentence has been segmented,
a function that DialogueTool offers.

If one were to use this tool for our purposes, it would be used strictly for
annotating sentence-level codes. All reference and sub-sentential codes would
have to be done in MMAX or some other tool. Like MMAX, the coding scheme is
user-specified. The input to the tool is a little simpler than MMAX in that
all that is required is a simple text file with each segment annotated with
speaker information. DialogueTool could be improved by being able to tag
groups of utterances.

4.3.  ILSP  Tool 

Another tool especially made for reference annotation is the ILSP tool6. It
uses the MATE reference annotation scheme, which is a subset of our scheme.
There are two main drawbacks to the tool. The first is the lack of a
reference manual, since the tool is not so straightforward to use. The second
is that you have to use its coding scheme; it is not possible to alter it as
in MMAX or DialogueTool.

4.4.  NITE  Workbench  (v  2) 

By design, the NITE Workbench7 addresses many of our basic criteria, by
including both transcription and flexible annotation definition and levels.
The tested version, however, was buggy and its method of segmenting data is
so complex as to render it unusable.

5http://www.eml-research.de/english/Research/NLP/Downloads

6ILSP Tool: http://www.ilsp.gr/

7NITE Workbench: http://nite.nis.sdu.dk

4.5.  DAT 

The DAT8 from the University of Rochester allows playing sound files for
utterances while annotating, which is very useful for checking intonation and
inflection when labeling dialogue acts. However, it does not allow markup at
the sub-sentential level, and it also requires that the sound file for a
dialog be broken up into one sound file per utterance. Source code (perl/tk)
is available so the coding scheme can be changed (though not as easily as
with a config file). Input/Output was a negative, since files are in a
special SGML format (with no standoff), rather than XML.
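
The requirement of one sound file per utterance means that a recorded
dialogue must be cut up using the utterance time stamps before annotation can
start. A minimal Python sketch of that step, using the standard wave module,
is given below. It assumes uncompressed PCM WAV input; the function names and
the output naming pattern are hypothetical and are not part of DAT.

```python
import wave


def extract_utterance(source_path, target_path, start, end):
    """Copy the audio between `start` and `end` (seconds) into a new WAV file."""
    with wave.open(source_path, "rb") as src:
        rate = src.getframerate()
        src.setpos(int(start * rate))
        frames = src.readframes(int((end - start) * rate))
        with wave.open(target_path, "wb") as dst:
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(rate)
            dst.writeframes(frames)


def split_dialogue(source_path, utterances, prefix):
    """Write one WAV file per (start, end) pair and return the file names."""
    names = []
    for number, (start, end) in enumerate(utterances, start=1):
        name = f"{prefix}_{number:03d}.wav"
        extract_utterance(source_path, name, start, end)
        names.append(name)
    return names
```

Given a dialogue recording and a list of (start, end) times taken from the
transcription, split_dialogue would produce files such as utt_001.wav and
utt_002.wav, one for each utterance.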

5.  Summary  of  Tool  Evaluation 

Figure 1 provides a quick description of the properties and advantages of the
tools we tested for easy comparison. Table entries marked with a “+” indicate
that the tool performed well in that category, and a “-” means it could have
performed better. In some cases, these categories aren’t binary, such as ease
of use, so we marked the category based on how well it fit our minimum
expectations.

There were many different ways to pick the tools, but in the end, the factors
we weighted the highest were ease of use by the annotators and ease of import
and export of data. As none of the tools we tested were capable of handling
all of our needs, we opted to use Transcriber for general transcription, MMAX
for coding, and Praat for prosodic analysis, utilizing Perl scripts to
convert data between formats when necessary. Transcriber was selected because
it offered easy playback and segmentation mechanisms. MMAX’s ability to
support both user-defined annotation schemes and multiple levels of markables
made it the obvious choice.

While some of the particular problems of appropriate tool support encountered
in the MREC development seem specific to the task domain (mixed radio and
face-to-face dialogues of military scenarios), similar challenges will need
to be faced in other domains as human language technologies expand into
increasingly realistic discourse situations where multi-party dialogue is
common. The common use of communication technology (e.g. widespread use of
cellular phones) renders the occurrence of such multi-modal communication as
discussed here increasingly commonplace in natural dialogue situations.

8DAT Tool: http://www.hcrc.ed.ac.uk/  amyi/mate/dat.html

Acknowledgments 

The work described in this paper was supported by the Department of the Army
under contract number DAAD 19-99-D-0046. Any opinions, findings and
conclusions or recommendations expressed in this paper are those of the
authors and do not necessarily reflect the views of the Department of the
Army. We would like to thank Mary Harper, Tomek Strzalkowski, Hilda Hardy,
Michael Kipp, and Christoph Mueller for advice and help acquiring and using
the tools described above. We would also like to thank Damon Davison, Nathan
Klinedinst and Despoina Theodorou for additional help in evaluating the
tools.

6.  References 

Bernsen, N.O., L. Dybkjær, and M. Kolodnytsky, 2002. The NITE workbench - a
tool for annotation of natural interactivity and multimodal data. In Third
International Conference on Language Resources and Evaluation (LREC 2002).

Hardy, Hilda, Kirk Baker, Helene Bonneau-Maynard, Laurence Devillers, Sophie
Rosset, and Tomek Strzalkowski, 2003. Semantic and dialogic annotation for
automated multilingual customer service. In Eurospeech-2003.

Kipp, Michael, 2001. Anvil - a generic annotation tool for multimodal
dialogue. In 7th European Conference on Speech Communication and Technology
(Eurospeech).

Muller, Christoph and Michael Strube, 2001. MMAX: A tool for the annotation
of multi-modal corpora. In 2nd IJCAI Workshop on Knowledge and Reasoning in
Practical Dialogue Systems.

Robinson, Susan, Bilyana Martinovski, Saurabh Garg, Jens Stephan, and David
Traum, 2004. Issues in corpus development for multi-party multi-modal
task-oriented dialogue. In Fourth International Conference on Language
Resources and Evaluation (LREC 2004).

Traum, David R., 2001. Ideas on multi-layer dialogue management for
multi-party, multi-conversation, multi-modal communication: Extended abstract
of invited talk. In Computational Linguistics in the Netherlands 2001:
Selected Papers from the Twelfth CLIN Meeting.


